BROWSING THROUGH TERABYTES


Wide-area  information  servers  open  a  new  frontier  in  personal  and  corporate  information  services 

RICHARD  MARLON  STEIN 


The  Library  of  Congress  archives 
roughly  25  terabytes  in  its  collec- 
tion. To  browse  through  this  vol- 
ume on  your  own  would  be  nearly 
impossible.  Wide-area  information  serv- 
ers supply  the  means  to  achieve  this  goal 
by  providing  the  user-interface  structure 
and  underlying  information-retrieval 
protocol  necessary  to  automatically  col- 
late, collect,  and  integrate  diverse  data 
streams.  WAISes  can  distill  the  contents 
of  vast  archives  into  neatly  manageable 
and  browsable  folders. 

On-line  information  services,  such  as 
BIX  and  CompuServe,  attest  to  the  need 
for  this  kind  of  technology.  Information 
has  acquired  a  commodity-like  status. 
While  not  on  a  par  with  wheat,  pork  bel- 
lies, or  gold  futures,  the  information-ser- 
vice industry  fills  a  vital  role.  The  next 
phase  of  information  commerce  will  add 
WAIS  capabilities  to  existing  on-line  ser- 
vices, opening  a  new  frontier  in  personal 
and  corporate  information  services. 

Intentions  and  Goals 

Initiated  in  early  1989,  the  WAIS  engi- 
neering effort  is  spearheaded  by  Think- 
ing Machines  (Cambridge,  MA),  the 
manufacturer  of  the  Connection  Ma- 
chine, a  massively  parallel  supercom- 
puter (see  reference  1).  The  principal 
goal  of  the  research  project  is  to  demon- 
strate "how  current  technology  can  be 
used  to  open  a  market  of  information  ser- 
vices that  will  allow  a  user's  workstation 
to  act  as  librarian  and  information  col- 
lection agent  from  a  large  number  of 
sources."  (See  reference  2.)  WAISes  aim 
to  enhance  existing  information  services 
and  provide  a  utilitarian  mechanism  for 
the  industry. 

Information servers already provide
direct  access  to  many  databases  and  ar- 
chive structures.  You  can  easily  check 
the  local  weather,  make  travel  reserva- 
tions, obtain  entertainment  schedules,  or 
browse  through  the  latest  stock-market 
quotes  on-line.  These  services  are  highly 
interactive,  charging  users  on  the  basis  of 
minutes  spent  on-line,  and  each  has  a 
unique  user  interface. 

WAISes  alleviate  unnecessary  user  in- 
teraction through  a  predominantly  com- 
puter-to-computer approach  to  remote 
information  retrieval.  By  minimizing 
human  interaction  with  a  remote  infor- 
mation server,  they  handle  requests  for 
information  expeditiously  and  inexpen- 
sively. WAISes  also  alleviate  unneces- 
sary complexity  by  moving  all  user  inter- 
action to  the  local  workstation  and  by 
having  WAIS  software  handle  all  trans- 
actions with  the  remote  server. 

On-line  servers  are  limited  in  their 
connectivity.  While  many  services,  such 
as  BIX,  CompuServe,  and  AppleLink, 
incorporate  wide-area  network  struc- 
tures, sharing  information  between  dif- 
ferent services  is  not  a  wholly  transpar- 
ent option.  This  restriction  constrains 
information  commerce  and  hampers  the 
circulation  of  potentially  useful  ideas. 

WAISes circumvent this barrier with a standard information-exchange
protocol that offers unlimited connectivity and retrieval
functionality. All servers can apply the WAIS protocol to their
archive structures to conduct information retrieval. (Unlimited
connectivity also raises concerns of security and privacy. See the
text box "The Right to Privacy" on page 160.)

The next phase of information commerce will add wide-area
information server capabilities to existing on-line services. WAISes
provide the user-interface structure and the underlying
information-retrieval protocol necessary to automatically collate,
collect, and integrate information from various sources. When these
are implemented, you should be able to directly access such sources
as the Library of Congress and the myriad of newspapers, journals,
and books.

Organized  and  coherent  information 
of  topical  importance  has  value.  Individ- 
uals and  companies  should  be  able  to 
market  their  information  to  the  widest 
possible  audience.  Current  on-line  ser- 
vices can't  easily  accomplish  this,  since 
their  connectivity  is  restricted. 

To  direct  your  information  to  the  best 
marketplace,  you  could  subscribe  to  mul- 
tiple on-line  sources  and  post  the  same 
message  on  all  of  them.  But  it  would  be 
more  efficient  to  post  the  data  on  one 
server  and  have  the  data,  or  an  abstract 
of  it,  broadcast  to  the  others.  Using  the 
WAIS  protocol,  WAISes  facilitate  this 
server  function. 

Suppose,  for  example,  you  have  re- 
viewed the  latest  set  of  RISC  micropro- 
cessor benchmarks,  taking  note  of  spe- 
cific architectural  advantages,  and  you 
wish  to  make  this  information  available 
to  others.  The  benchmark  review  is  kept 
on  your  home  computer  (i.e.,  the  local 
WAIS),  which  is  equipped  with  WAIS 
technology.  The  nearest  remote  WAIS,  a 
hub  within  a  network  of  servers,  also  has 
a  folder  for  RISC  microprocessors.  So 
you  make  a  posting  to  the  nearest  hub 
server  that  inserts  a  pointer  to  the  review 
on  your  home  computer. 

Everyone  with  a  computer  running  the 
WAIS  user-interface  software  can  pre- 
sent information  to  a  server  and  receive 
compensation  for  whatever  portion  of  it 
other  WAIS  subscribers  access.  The 
compensation  can  be  monetary,  or  you 
can  barter  your  information  for  someone 
else's. 

Even  publishers  of  books,  magazines, 
newspapers,  and  music  can  participate 
and  profit  from  WAISes.  For  example, 
how  much  money  could  a  newspaper  save 
in  circulation  costs  if  you  received  the 
morning  paper  electronically  instead  of 
printed  on  paper?  Similarly,  how  much 
money  could  a  book  publisher  save  if  you 
purchased  a  new  best-selling  novel  elec- 
tronically instead  of  at  a  bookstore? 

Traditional information delivery is expensive, and costs are rising.
The U.S. Postal Service frequently raises its fees to cover increases
in the cost of handling and transporting information. Traditional
information transport also represents a significant fraction of
transport volume and collateral energy consumption. Moving
information electronically can result in enormous savings.

Computer  networks  such  as  Internet 
are  conduits  of  information  transport.  To 
replace  manual  transportation  methods, 
the existing electronic infrastructure
must  accommodate  the  newly  anticipated 
volume  of  traffic.  Plans  for  "a  national 
network  of  data  superhighways,"  which 
will  be  installed  within  the  next  few 
years,  are  under  way  (see  references  3 
and  4). 

A  principal  motivation  for  WAIS  tech- 
nology is  to  be  able  to  retrieve  topical  in- 
formation for  research  or  investigation, 
not  just  to  deliver  consumable  items  like 
newspapers  or  books.  Toward  this  end, 
WAISes  rely  on  a  novel  structure  for  in- 
formation retrieval,  the  dynamic  folder. 

To  use  a  WAIS,  you  formulate  a  ques- 
tion (see  figure  1),  find  the  information 
servers  that  provide  satisfactory  re- 
sponses, and  create  a  dynamic  folder. 
The  purpose  of  the  dynamic  folder  is  to 
constantly  or  periodically  update  its  con- 
tents with  new  material  on  the  subject. 

Formulating  a  question  is  natural  to  us 
all.  The  difficult  part  is  locating  the  per- 
tinent information  to  answer  it.  Manual- 
ly locating  the  information  can  be  labori- 
ous and  tedious.  WAISes  automate  the 
search-and-retrieval  process.  To  deter- 
mine which  servers  hold  the  information 
most  pertinent  to  your  question,  and 
where  you  should  submit  dynamic  fold- 
ers, you  may  want  to  consult  server  di- 
rectories. 

Server  Directories 
WAIS  directories  are  servers  that  sup- 
port a  directory-services  function.  They 
are  indexes  to  other  services  within  the 
WAIS  network  and  are  organized  to  help 
you  locate  information.  Like  telephone- 
directory  services,  WAIS  directories  list 
pointers  to  servers,  which  are  grouped 
according  to  content  and  function. 

A  directory-entry  header  contains  suf- 
ficient data  to  describe  the  service,  such 
as  an  English-language  description  of  the 
server,  the  parent  server  (if  the  server  is 
a  subsidiary  of  a  larger  one),  related 
servers,  contact  information  (including 
networks  and  human-interface  points), 
and  cost  information. 

The  local  workstation,  when  equipped 
with  a  WAIS,  should  maintain  a  direc- 
tory entry  that  includes  the  directory- 
entry  header,  a  locally  determined  rank, 
subscription  information  (if  any),  user 
comments,  and  the  time  of  last  contact. 
You  can  use  this  information  to  decide 
whether  to  contact  the  server  and  how  to 
handle  the  responses. 
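
As a rough illustration, the two records described above could be modeled as follows. This is a minimal sketch, not the WAIS data format: the class names, field names, and the worth_contacting helper are hypothetical, chosen only to mirror the items the text lists.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class DirectoryEntryHeader:
    """What a server publishes about itself (field names are illustrative)."""
    description: str                      # English-language description
    contact: str                          # network address or human contact point
    cost: str                             # cost information, free-form here
    parent_server: Optional[str] = None   # set if this server is a subsidiary
    related_servers: List[str] = field(default_factory=list)


@dataclass
class LocalDirectoryEntry:
    """What the local workstation keeps alongside the published header."""
    header: DirectoryEntryHeader
    rank: float = 0.0                     # locally determined rank
    subscription: Optional[str] = None    # subscription information, if any
    comments: List[str] = field(default_factory=list)
    last_contact: Optional[datetime] = None


def worth_contacting(entry: LocalDirectoryEntry, min_rank: float) -> bool:
    """Decide whether to contact a server, using only the local rank."""
    return entry.rank >= min_rank


# Example entry; the server name is a placeholder, not a real host.
header = DirectoryEntryHeader(
    description="Articles on RISC microprocessors",
    contact="risc-hub.example",
    cost="per document",
)
entry = LocalDirectoryEntry(header=header, rank=0.8)
```

A real client would presumably weigh cost and time of last contact as well; the single-threshold rule here is only the simplest possible policy.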

By  using  content  navigation,  you  can 
find  the  most  appropriate  server  to 
The Right to Privacy


WAIStation,  a  prototype  user 
interface  developed  by  the 
Thinking  Machines  wide- 
area  information  server  proj- 
ect staff,  embodies  many  functional  as- 
pects of  WAIS  technology.  Forming 
and  refining  queries  via  relevance  feed- 
back, server  selection,  and  dynamic 
folders  are  the  principal  features  that 
this  prototype  supports.  These  assets 
provide  a  powerful  tool  set  for  infor- 
mation retrieval.  While  WAIStation 
achieves  several  desirable  technical 
goals,  the  security  and  privacy  issues 
have  not  yet  received  serious  attention 
and  need  refinement. 

Security  and  privacy  issues  are  not 
specific  to  WAIStation  or  WAISes  in 
general,  but  are  endemic,  topical  con- 
cerns of  the  information-retrieval  in- 
dustry as  a  whole.  WAIS  technology 
seeks  to  extend  connectivity  through  the 
WAIS  protocol,  thus  intensifying  the 
urgency  of  security  measures  and  stan- 
dards. Greater  connectivity  promotes 
information  commerce,  but  it  also  adds 
to  the  risk  of  compromising  the  privacy 
and  confidentiality  of  electronic  trans- 
actions. 

Individuals and corporations that subscribe to WAISes must safeguard
proprietary information. The tendency to organize information within
a computer for ease of access or to act as a convenient archive
creates a security and privacy dilemma. And if the sensitive data is
located on a machine with high connectivity, the risk is multiplied.

A  WAIStation  that  holds  personal  in- 
formation, such  as  tax  forms,  diaries, 
business  transactions,  medical  records, 
or  bank  accounts,  must  be  protected 
from  intrusion  by  unauthorized  individ- 
uals. A  computer  system  storing  this 
information  "knows"  more  about  you 
than  you  can  instantly  recall.  Access  to 
this  personal  data  must  be  protected, 
controlled,  and  limited  to  authorized 
individuals. 

The  WAIS  protocol  is  an  application- 
layer  protocol  that  runs  over  X.25  com- 
munications, modems,  or  IEEE  802.3 
(Ethernet)  backbones.  Residing  beneath 
this  protocol  is  the  WAIStation  host 
computer  and  operating  system.  Ex- 
tracting information  from  the  server  de- 
pends on  access  granted  through  a  rec- 
ognition and  authentication  system  that 
the  host  computer  operates.  Only  autho- 
rized subscribers  can  access  informa- 
tion from  the  server. 

The WAIS protocol is stateless, so each transaction, whether a query
or document-retrieval process, exists in a separate context at the
server. Subversion of the WAIS protocol, whether intentional or
accidental, might unlock or bypass a server's native file-system
protection structure. If it did, the entire archive contents would
be available to the intruding party.

The  WAIS  protocol  should  be  noncor- 
ruptible  and  should  detect  privileged 
transactions  (i.e.,  those  data  streams 
that  possess  restricted  command  se- 
quences). However,  to  be  effective  as  a 
noncorruptible  application-layer  pro- 
tocol, the  underlying  computer  system 
must  also  be  unbreachable. 

Unfortunately,  you  cannot  always 
guarantee  protection.  In  1988,  a  virus 
introduced  through  a  known  port  as- 
saulted computer  systems  attached  to 
Internet.  Subsequent  sleuthing  discov- 
ered that  a  remote  system  could  activate 
the  debug  mode  of  the  Unix  mailer, 
forcing  the  instigator  into  a  privileged 
state.  The  debug  mode  then  permitted 
the  virus  to  propagate  and  multiply. 

Can  a  rogue  dynamic  folder,  fash- 
ioned after  the  Internet  virus,  intention- 
ally access  information  from  strategic 
servers  running  WAIS  software?  How 
will  WAISes  safeguard  information 
against  illegal  intrusion? 

The  right  to  privacy  is  inalienable, 
and  WAIS  technology  or  any  enabling 
system  that  promotes  information  com- 
merce must  preserve  it.  A  cautionary 
approach toward implementing WAIS
technology  is  necessary  and  appropri- 
ate. Several  legal  issues  must  be  ad- 
dressed to  secure  both  privacy  and  fair 
business  practice. 


handle  a  query.  For  example,  a  question 
on  RISC  microprocessor  benchmarks 
would  list  directory  entries  for  servers  as 
well  as  pointers  to  articles  on  the  subject. 
When  you  retrieve  a  document,  the  di- 
rectory entry  is  also  provided.  Thus,  you 
obtain  ranking  information  for  questions 
of  similar  content. 

Each  server,  then,  contains  informa- 
tion of  value  to  certain  subscribers.  The 
dynamic  folder  can  continuously  poll 
newspaper  servers  for  new  articles  as 
they  arrive  from  the  news  wires,  while  it 
would  probably  query  a  dictionary  or  en- 
cyclopedia server  only  once,  since  the 
content  changes  much  less  frequently. 

Policing the large number of anticipated servers (in the tens of
thousands) requires an independent quality-control mechanism. An
audit of the server directory would reflect any server that
frequently returns erroneous information or does not perform. An
independent agency like Consumer Reports, the Better Business
Bureau, or other watchdog groups could create rating servers, which
monitor and rate other servers in the directory.

These  rating  servers  resemble  movie 
and  TV  critics.  Consumers  acquire  con- 
fidence in  the  reports  and  reviews  that 
certain  critics  issue  because  they  share 
similar  tastes.  Just  as  moviegoers  start 
to  trust  a  particular  reviewer  who  has 
agreed  with  them  on  past  movies,  WAIS 
users  will  begin  to  trust  the  specific  rat- 
ing services  that  agree  with  them. 

A subscriber base generates income for a server. The rating servers
will attract subscribers as well, for they direct trends in the
information marketplace. In fact, they may become the first
"information speculators" as a by-product of WAIS technology.

Dynamic  Folders 

A  folder,  like  those  found  on  the  Macin- 
tosh, provides  the  WAIS  framework  for 
organizing  questions.  A  folder  is  a  re- 
pository for  documents.  A  file  system,  in 
the  Macintosh  sense,  is  full  of  folders 
organized  in  a  tree  structure  that  sup- 
ports an  efficient  document-location 
mechanism. 

To  find  a  document  within  a  file  sys- 
tem, you  typically  use  the  find  com- 
mand under  Unix  or  Finder  on  the  Mac. 
Figure 2: The similar to function lets you retrieve more documents on notepad
computers  using  relevance  feedback.  You  then  might  initiate  a  search  for  additional 
documents  with  similar  content.  Selecting  text  from  a  section  of  a  retrieved  document 
helps  to  refine  subject-matter  searches  or  locate  collateral  information.  You  can  also 
use  the  selected  text  to  execute  a  new  query.  (Courtesy  of  Thinking  Machines  Corp.) 


With  one  of  these  tools,  you  can  locate 
the  position  of  a  file  and  gain  access  to  its 
contents.  Path-driven  locators  search  an 
information  base  for  a  document's  name, 
but  they  do  not  provide  a  means  to  exam- 
ine its  contents. 

Retrieving  documents  pertinent  to  a 
specific  question  requires  content  navi- 
gation (i.e.,  examining  the  contents  of  a 
document,  or  a  representative  abstract  or 
index  for  the  document,  for  its  relevance 
to  the  question).  The  similarity  between 
the  question  and  the  document's  index 
determines  a  retrieval  score,  an  indica- 
tion of  the  likelihood  that  the  document 
is  pertinent. 
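
The article does not say how the similarity is computed. As one plausible illustration, the sketch below scores a question against a document index with cosine similarity over word counts. The scoring method and the function names (term_counts, retrieval_score, rank) are assumptions made for this example, not a description of the WAIS implementation.

```python
import math
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    """Lowercase the text and count its words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def retrieval_score(question: str, document_index: str) -> float:
    """Cosine similarity between word counts, in the range 0.0 to 1.0."""
    q, d = term_counts(question), term_counts(document_index)
    dot = sum(q[w] * d[w] for w in q)
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    d_norm = math.sqrt(sum(v * v for v in d.values()))
    norm = q_norm * d_norm
    return dot / norm if norm else 0.0


def rank(question: str, indexes: dict) -> list:
    """Return (title, score) pairs, best match first."""
    scored = [(title, retrieval_score(question, text))
              for title, text in indexes.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A document that shares no words with the question scores 0.0, and an index identical to the question scores 1.0, so the score can be read as a crude likelihood that the document is pertinent.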

WAISes  rely  on  the  dynamic  folder  to 
encapsulate  a  question.  In  its  most  pas- 
sive form,  it  contains  a  question  and  a  set 
of  servers  to  target.  The  WAIS  posts  the 
dynamic  folder  to  servers  of  known  qual- 
ity and  functionality,  and  then  query 
processing  begins. 

The dynamic folder executes a remote query that sends questions to
the remote servers. There the questions find relevant information
and return a list of document titles (document pointers)
encapsulated within the originating folder to the local WAIS system.
The results from the query may initially include a list of documents
with fair, good, or high similarities.

Now  you  can  refine  your  query  strate- 
gy by  perusing  the  document  titles  to  de- 
termine which  are  the  most  appropriate 
documents.  WAIS  technology,  in  the 
form  of  the  WAIStation  user  interface 
(see  reference  5),  assists  this  process 
through  a  content-associativity  function 
known  as  similar  to. 

The  similar  to  function  informs  the 
WAIS user interface that a document is
"interesting." The server uses this information to find other documents that are
similar  to  the  one  you  have  chosen.  This 
search  strategy,  an  embedded  compo- 
nent of  WAISes,  represents  a  significant 
improvement  over  traditional  database 
methods,  such  as  Structured  Query  Lan- 
guage (SQL)  and  Boolean  search. 

This  form  of  query  execution  is  known 
as  relevance  feedback.  It  lets  you  extend 
the  query  to  incorporate  a  "more-like- 
that-one"  functionality  and  lets  you  re- 
trieve documents  that  have  similar  con- 
tents. The  WAIS  user  interface  is 
organized  around  the  English  language, 
and  English-language-oriented  query 
structures  are  easier  to  use  than  SQL. 
The similar to function is like working with a reference librarian. First, you
state  the  topic  of  your  research,  which  the 
librarian  translates  into  queries.  After 
you  examine  the  results  of  the  queries, 
you  indicate  which  results  were  on  the 
mark;  thus,  the  librarian  gains  a  better 
understanding  of  your  needs  and  can  im- 
prove the  search. 

With  relevance  feedback,  WAISes  can 
retrieve  documents  with  greater  ease  and 
speed.  You  no  longer  need  to  alter  a  SQL 
Boolean operator to adjust the query filter; instead, you can ask
for "more documents like this one."
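
One common way to realize "more documents like this one" is to fold the words of the chosen document back into the query with a reduced weight. The sketch below shows that idea only. It is an assumption for illustration, not the algorithm WAIStation or the Connection Machine actually uses, and the names more_like_this and weight are hypothetical.

```python
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    """Lowercase the text and count its words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def more_like_this(query: Counter, liked_document: str,
                   weight: float = 0.5) -> Counter:
    """Return a new query: the original term weights plus the liked
    document's terms, each scaled by `weight`. The input is not modified."""
    expanded = Counter(query)
    for term, count in term_counts(liked_document).items():
        expanded[term] += weight * count
    return expanded
```

Terms that appear in both the question and the chosen document end up with the highest weights, so the next search leans toward documents that resemble the one you marked.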

Dynamic  folders  can  also  possess  vi- 
tality, which  gives  the  folder  a  continu- 
ous charter  to  execute  queries  periodical- 
ly and  update  its  contents  with  new 
material.  A  folder's  charter  expresses 
purpose,  intent,  and  the  goal  that  you 
want  the  query  to  accomplish.  You  can 
build  the  folder  to  periodically  poll  serv- 
ers known  to  receive  frequently  updated 
material  that  matches  its  charter. 
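
To make the idea of vitality concrete, here is a minimal sketch of a folder that polls its target servers on a schedule and keeps only titles it has not seen before. Everything in it is hypothetical: a "server" is modeled as a plain function from a question to a list of titles, and the time values are passed in by the caller rather than read from a clock.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

# A "server" here is just a function that maps a question to document titles.
Server = Callable[[str], List[str]]


@dataclass
class DynamicFolder:
    question: str
    servers: Dict[str, Server]
    poll_interval: float                      # time units between polls
    contents: List[str] = field(default_factory=list)
    _seen: Set[str] = field(default_factory=set)
    _last_poll: float = float("-inf")         # never polled yet

    def due(self, now: float) -> bool:
        """True when at least poll_interval has passed since the last poll."""
        return now - self._last_poll >= self.poll_interval

    def poll(self, now: float) -> List[str]:
        """Query every target server and keep only titles not seen before."""
        new_titles = []
        for server in self.servers.values():
            for title in server(self.question):
                if title not in self._seen:
                    self._seen.add(title)
                    new_titles.append(title)
        self.contents.extend(new_titles)
        self._last_poll = now
        return new_titles
```

A newspaper server might get a short interval and an encyclopedia server a very long one, which matches the polling behavior the article describes for frequently and rarely updated sources.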

If  the  search  retrieves  an  interesting 
document,  WAISes  let  you  select  a  por- 
tion of  the  text  and  use  it  as  an  adjunct  to 
the  initial  query.  Selecting  text  from  a 
portion  of  a  document  that  may  contain 
some  particularly  topical  or  relevant  in- 
formation and  using  it  to  refine  the 
search  is  an  innovative  approach  for  ex- 
ploring subjects  (see  figure  2). 

WAISes  also  let  you  chain  questions  by 
taking  the  results  of  a  previous  search, 
starting  a  new  question  with  different 
subject  matter,  and  dragging  the  previ- 
ous results  into  the  similar  to  menu  box 
(see  figure  3).  Chaining  questions  can 
either  broaden  or  narrow  a  search,  de- 
pending on  the  relevance-feedback  re- 
sults. 

The  recursive  capacity  of  dynamic 
folders  to  initiate  "sibling"  folders  dem- 
onstrates the  WAIS  potential  to  harness 
and  refine  subject  matter.  Query  refine- 
ment alters  the  charter  of  a  dynamic 
folder.  Sibling  dynamic  folders  execute 
directed  searches  and  can  have  an  auton- 
omous authority  to  broaden  the  range  of 
server  choices. 

Controlling  the  extent  of  search  expan- 
sion is  a  critical  issue.  For  individuals, 
cost  can  be  an  overwhelming  concern. 
WAIS  technology  does  not  yet  contain  an 
accounting  system  to  govern  search  crite- 
ria. Participating  information  services 
will  have  to  engineer  this  element  of  the 
technology  themselves. 

WAIS  Protocol 

WAISes  promote  connectivity  and  access 
to  remote  electronic-information  sources 
through  a  standard  protocol,  the  WAIS 
protocol. This protocol is an extension of the National Information
Standards Organization (NISO) Z39.50-1988 specification, which
defines an interface to remote information-retrieval services and
library-protocol applications. The Z39.50 standard is the backbone
of the WAIS protocol and the foundation for WAIS applications
development.

Incorporating  the  Z39.50  standard 
into  the  WAIS  protocol  frees  developers 
to  build  articulated  user  interfaces  for 
WAIS  applications.  The  interface  stan- 
dard isolates  the  server's  text-retrieval 
method,  such  as  SQL,  giving  the  applica- 
tion a  transparent  access  mode.  The  par- 
ticulars of  database  queries  are  hidden 
beneath  the  interface.  A  developer  only 
needs  to  be  sure  that  the  server  possesses 
an  equivalent  functionality  to  conduct 
remote  information-retrieval  transac- 
tions from  a  local  WAIS  workstation. 

Concealing the server's implementation through the WAIS protocol is
important in another respect as well. Isolating the implementation
implies that you can specify a single, more palatable query
language. The WAIS protocol also lets you use an
English-language-style query lexicon instead of cryptic SQL or
fourth-generation languages. When you find a document that is
appropriate, the WAIS protocol automatically handles the download
process from the server. This is quite different from existing
services, where manual file-capture mechanisms require vigilance.
With the WAIS protocol, all documents look like they are local to
your system.

Figure 3: Chaining questions permits you to use a query on multiple information
sources by opening a new question and dragging previous query results into the
similar to field. You can also apply the similar to operation to invoke a new
document search, as in this example. (Courtesy of Thinking Machines Corp.)

The WAIS protocol incorporates two important modifications that the
NISO Z39.50 standard does not address. First, it permits hypermedia
document transport. Most documents today are composed primarily of
ASCII text codes and sequences, but the next generation of
documents, constructed from hypermedia and multimedia sources,
integrates images and fully formatted text. These media forms are
rapidly becoming popular and conventional.

Second,  the  WAIS  protocol  is  stateless 
for  the  server.  It  does  not  have  to  keep 
any  information  about  the  client  between 
transactions,  because  the  user's  state  is 
kept  on  the  local  workstation.  Every 
search  or  retrieval  operation  is  a  separate 
process.  The  contexts  are  decoupled 
under  the  statelessness  of  the  protocol. 
This  decoupling  lets  you  make  a  search, 
store  away  the  document  pointer,  and  re- 
trieve it  later. 
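
The practical effect of statelessness can be sketched in a few lines. In the example below, search and retrieval are independent calls, and the only thing linking them is a document pointer that the client stores. The archive contents, the pointer layout, and the function names are all hypothetical; the real WAIS protocol encodes its requests according to Z39.50, which this sketch does not attempt to model.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory archive: document id -> (title, text).
ARCHIVE: Dict[str, Tuple[str, str]] = {
    "doc-1": ("RISC benchmarks",
              "Review of the latest RISC microprocessor benchmarks."),
    "doc-2": ("Notepad computers",
              "Computer makers stake out the pen-computing market."),
}

DocumentPointer = Tuple[str, str]  # (server name, document id)


def search(server_name: str, words: str) -> List[DocumentPointer]:
    """Stateless search: the request carries everything the server needs."""
    wanted = set(words.lower().split())
    return [
        (server_name, doc_id)
        for doc_id, (title, text) in ARCHIVE.items()
        if wanted & set((title + " " + text).lower().split())
    ]


def retrieve(pointer: DocumentPointer) -> str:
    """Stateless retrieval: works without any memory of an earlier search."""
    _server_name, doc_id = pointer
    return ARCHIVE[doc_id][1]
```

Because retrieve needs nothing but the pointer, you can save the pointer, come back later, or hand it to someone else, and the server never has to remember who ran the original search.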

Further, you can use a dynamic folder to pass one of these document
pointers to someone else who can also retrieve the document. A
document pointer is like an International Standard Book Number for
the electronic age. (The ISBN is a unique identification assigned to
each publication.) Passing a document pointer conforms with
copyright law and lets you easily return to the document source
instead of making copies.

The  WAIS  protocol  is  designed  to 
transport  information  through  modems, 
X.25  communications,  or  network  back- 
bones. This  flexibility  provides  an  enor- 
mous framework  within  which  to  con- 
duct retrieval  transactions.  For  example, 
with  a  portable  computer,  you  could  con- 
nect with  a  WAIS  hub  through  a  modem 
and  post  dynamic  folders,  directing  the 
query  results  to  be  routed  to  your  office 
system  for  later  examination. 

Retrieval  Technology 

The  computing  infrastructure  needed  to 
implement  WAISes  varies  with  a  server's 
functionality.  A  Library  of  Congress 
WAIS,  with  25  terabytes  of  data,  could 
not  expeditiously  dispatch  queries  and 
function  if  a  serial  computer  were  used  to 
process  the  information.  For  a  problem 
of  this  magnitude,  massive  parallelism  is 
needed. The Connection Machine's
Text-Retrieval System is a viable
information-retrieval system for gigabyte-size
databases.

The DowQuest service from Dow Jones runs on the Connection Machine.
The service incorporates approximately 1 gigabyte of original text
derived from over 400 sources. The Wall Street Journal, the
Washington Post, Barron's, Fortune, Forbes, and several regional
business and technical journals are included, covering the previous
eight calendar months. The search time with a 100-word query
composed of typed English and relevance feedback (e.g., "more like
that one") is less than half a second. The system can provide
access to many gigabytes of text and to thousands of users
interactively.

The projections for the Connection Machine system indicate that
when it is scaled to a 1-terabyte database with 10-word queries,
obtaining an answer within 10 seconds is highly probable. This
performance is accomplished by harnessing the Connection Machine's
65,536 separate processors to execute a parallel index algorithm
(see reference 6). These estimates are indicative of the computing
power manifest in parallel systems. No serial machine can come
close to this level of performance.
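A back-of-the-envelope calculation helps put these figures in proportion. The sketch below assumes that the 1-terabyte database is spread evenly over all 65,536 processors and that every byte is examined once per query. Neither assumption is stated in the projections, and 1 terabyte is taken here as 10^12 bytes.

```python
PROCESSORS = 65_536          # processor count quoted in the article
DATABASE_BYTES = 10 ** 12    # 1 terabyte, taken as 10^12 bytes (assumption)
TIME_BUDGET_S = 10           # projected response time from the article


def per_processor_load(total_bytes, processors, seconds):
    """Return (bytes each processor holds, bytes per second it must cover)."""
    share = total_bytes / processors
    return share, share / seconds


share, rate = per_processor_load(DATABASE_BYTES, PROCESSORS, TIME_BUDGET_S)
aggregate = DATABASE_BYTES / TIME_BUDGET_S

print(f"share per processor: {share / 1e6:.1f} MB")
print(f"rate per processor:  {rate / 1e6:.2f} MB/s")
print(f"aggregate rate:      {aggregate / 1e9:.0f} GB/s")
```

Under these assumptions each processor is responsible for roughly 15 megabytes and has to cover about 1.5 megabytes per second, while a single serial machine would have to sustain the whole 100 gigabytes per second on its own.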

The Connection Machine system generates these results by searching
the entire contents of an archive, not a representative abstract of
a keyword frequency table. Each document within the archive is used
to determine a match. This is not typical for systems organized
around serial computers, and it is another dramatic demonstration
of parallel-computing technology.

The cost of a system like the Connection Machine runs in the
millions of dollars. But a Macintosh with a 100-megabyte hard disk
drive or a 386-based PC can serve the typical WAIS user.

Immense  Promise 

The prototype WAIS user interface and protocol are currently being
beta-tested at Thinking Machines, Apple Computer, and Dow Jones
News/Retrieval. Thinking Machines, the principal developer of the
WAIS architecture and software, plans to share the WAIS protocol
free of charge and hopes to help user-interface developers build
interfaces to WAIS servers.

While still a research project that is undergoing development and
refinement, WAIS holds immense promise. Information commerce,
buoyed through the widespread acceptance of computer systems and
networks, forces individuals and companies to expedite transactions
and simplify activities. These coveted sources of efficiency stand
out as prominent allies of competitive advantage. ■

Mom's symptomless herpes threatens baby

Herpes can be deadly for newborns who acquire the viral
infection from their mothers during labor or delivery. To
complicate matters, only about 25 percent of adults with genital
herpes display telltale symptoms (SN: 6/28/86, p.410).
Seattle-based researchers have now tested a group of mothers during
labor for asymptomatic but infectious herpes, and have
reached some disturbing conclusions. Their data suggest that
such screening will neither identify all women who risk passing
the virus to their newborns, nor allow physicians to save
infected babies from a devastating outcome.

Zane  A.  Brown  and  his  co-workers  at  the  University  of 
Washington  identified  56  women  with  active,  asymptomatic 
herpes  infections  among  the  15,923  laboring  women  tested  at 
two  local  hospitals  between  1984  and  1989. 

Type  1  herpesvirus,  the  "oral"  form  usually  associated  with 
cold  sores,  rarely  infects  the  genitalia.  However,  says  Brown, 
"our  data  indicate  that  when  it  is  present  [in  the  genitalia],  it 
transmits  more  readily  [than  Type  2]  to  infants."  Three  of  the 
five  women  (60  percent)  with  Type  1  herpes  infected  their 
babies,  compared  with  seven  of  the  51  women  (14  percent)  with 
active  Type  2  herpes,  the  researchers  report  in  the  May  2  New 
England  Journal  of  Medicine. 

The Type 1 infection almost never harms a newborn, Brown
observes. In his study, all infants contracting Type 1 herpes
developed normally. By contrast, one of the seven infants with
Type 2 herpes died, and four developed disabling encephalitis.

"The  really  big  risk  of  neonatal  infection  and  damage  or 
death  occurs  if  a  woman  first  acquires  [Type  2]  herpes  late  in 
her  pregnancy,"  Brown  says.  And  fully  one-third  of  the  mothers 
with  asymptomatic  herpes  in  this  study  were  experiencing 
their  first,  or  "primary,"  episode  of  this  periodically  recurring 
disease,  he  adds.  Although  the  infected  infants  were  identified 
within  24  hours  of  birth  —  far  earlier  than  usual  —  and 
immediately treated with antiviral drugs, "we didn't significantly
change the ultimate outcome," Brown says. "Kids with
Type 2 disease got sick no matter what we did."

He  notes  another  unsettling  finding:  Unlike  most  adults, 
infants  carrying  antibodies  to  the  Type  1  virus  did  not  contract 
a  milder-than-normal  version  of  the  Type  2  disease. 

Fortunately, the more serious Type 2 virus "transmits
reluctantly," Brown concludes. But transmission may occur in
unexpected ways. He and his colleagues speculate that in eight
of the babies with herpes infections, physicians may have
unwittingly opened a portal for the virus by inserting
electrodes into the fetus' scalp to monitor prenatal heartbeats.

A  gold-plated  test  for  Lyme  disease 

Current tests for Lyme disease detect the antibodies
produced when a person's immune system responds to Borrelia
burgdorferi, the tick-borne bacteria that cause the disease. But
because some people are slow to make such antibodies, the test
doesn't always provide an accurate diagnosis. If left untreated,
Lyme disease can cause chronic arthritic symptoms.

Scientists  have  now  developed  a  prototype  test  that  directly 
spots  bits  of  B.  burgdorferi.  Working  at  the  Rocky  Mountain 
Laboratories  of  the  National  Institute  of  Allergy  and  Infectious 
Diseases  in  Hamilton,  Mont.,  the  group  created  gold-tagged 
antibodies  that  home  in  on  two  of  the  bacteria's  surface 
proteins.  The  gold  enables  scientists  to  image  bacteria-binding 
antibodies  in  the  blood  with  an  electron  microscope,  thereby 
clinching  the  microbes'  presence,  the  researchers  report  in  the 
June  Journal  of  Clinical  Microbiology. 

David W. Dorward, who directed the work, notes that most
medical laboratories lack the electron microscopes needed to
conduct the new test. However, he says, the gold-labeled
antibodies could be adapted for use in a routine diagnostic test.

Ivars Peterson reports from San Jose, Calif., at the Physics Computing
'91 conference

Navigating  the  information  swamp 

The ubiquitous lab notebook, with its dog-eared corners,
stained pages and scribbled entries, may one day give way to
an electronic analog that permits not only the recording of data
but also the sharing of information among researchers scattered
throughout the world. Researchers at Baylor College of
Medicine in Houston have developed a sophisticated,
computer-based scheme, called the Virtual Notebook System, that
allows its user to gather, organize and annotate information
selected from a variety of sources.

With such a notebook, a medical researcher interested in the
diagnosis of a certain ailment, for example, can readily
assemble a package consisting of X-ray images, personal
comments, citations, journal articles, news items,
electronic-mail extracts and other relevant pieces of information.
Moreover, the researcher can instantly share that information with
others who use the same system, even if they are thousands of
miles away. "You can even write in someone else's notebook,"
says Kevin B. Long, who directed the project.

Designed to facilitate collaboration, the system's key element
consists of software that masks the underlying maze of
computers and computer networks that often stands in the way
of efficient and convenient communication among researchers
working with different computer equipment. The Virtual
Notebook System also incorporates a new programming
approach for simplifying the indexing and retrieval of information
stored in computers. A specially programmed, information-seeking
computer, known as the Wide Area Information
Server and developed under the direction of Brewster Kahle of
Thinking Machines Corp. in Cambridge, Mass., responds to
requests typed in English. Users don't have to know exactly
how to find the information they need; nor do they have to
remember any special instructions to locate data.

Best suited for groups of researchers already linked by
computer networks, the Virtual Notebook System may prove a
crucial component of large collaborative efforts. Officials with
the Superconducting Super Collider are investigating the
system as a possible means of sharing and analyzing
experimental data when the accelerator is eventually completed.

Exploring  the  virtual  wind 

Calculations  of  the  direction  and  speed  at  which  air  flows 
past  a  complicated,  three-dimensional  object,  such  as  an 
airplane,  generate  huge  quantities  of  data.  Conventional  two- 
dimensional  graphic  images  derived  from  these  data  often  fail 
to  convey  the  flow's  complexity.  Now,  a  team  of  researchers  has 
assembled  a  primitive,  prototype  system  for  exploring  such 
flow  patterns,  in  effect  allowing  an  investigator  to  step  into  and 
interact  with  a  computer-generated  environment. 

Steve  Bryson  and  Creon  Levit  of  the  NASA  Ames  Research 
Center  in  Mountain  View,  Calif.,  used  commercially  available 
components  to  create  their  prototype  system,  known  as  the 
Virtual  Windtunnel. 

Through  computer  graphics  and  special  input  devices,  the 
system  creates  the  illusion  of  being  surrounded  by  a  flow.  The 
user  looks  through  a  boom-mounted  device  resembling  a 
diver's  mask,  which  contains  two  small  television  sets  to 
produce  a  wide-angle,  stereoscopic  image.  A  computer  tracks 
the  viewer's  head  position  and  generates  the  appropriate 
views.  The  user  also  wears  a  flexible  glove  fitted  with  sensors  to 
manipulate  the  image  in  various  ways.  For  example,  to  visualize 
the  direction  of  flow  in  a  particular  region,  a  researcher  can  use 
the  glove  to  specify  the  starting  point  for  a  computer-rendered 
stream  of  smoke,  and  then  walk  around  to  see  the  resulting  flow 
pattern  from  different  angles. 

"It  puts  the  computations  right  in  front  of  the  researcher," 
Bryson  says.  "It  allows  real  interaction  with  the  data." 
Total usage of Quake

Number of Clients: 6729
Number of Searches: 12652
Number of Retrievals: 33897
Total Transactions: 46549
Prototype links Mac,
Connection Machine


By  Henry  Norr 

Cupertino,  Calif.  —  Thinking 
Machines Corp., a pioneer in the
development of high-powered parallel-processing supercomputers,
has joined with Apple, Dow Jones & Co. and KPMG Peat Marwick
to develop a new technology designed to simplify the retrieval of
textual information stored in personal files, corporate records and
remote databases.

Called the Wide Area Information Server (WAIS) project, the
collaborative venture has been under way for almost two years. Peat
Marwick recently completed a four-month experiment with the system,
using WAIStation, a prototype Mac front end developed by Thinking
Machines of Cambridge, Mass. Engineers from Apple's Advanced
Technology Group have combined the WAIS technology with a custom
interface to build a prototype personal electronic newspaper.

The WAIS project was designed in part to address problems caused
by the proliferation of electronic data within large organizations.

"Corporations are starting to gag on gigabytes of word processing
files, memos, reports, articles and E-mail archives," said Brewster
Kahle, WAIS project leader for Thinking Machines. "Corporate memory
is stored in this form, but executives have no easy way to get at
it."

Brewster Kahle, WAIS project leader, helped develop an experimental
text-retrieval system that can use a Thinking Machines
supercomputer as a server and the Mac as a front end.

But the WAIS project was intended from the beginning, Kahle said,
to be more than a traditional executive information system working
only within corporate bounds. The objective was to lay the
foundations for a scalable system that would allow users to tap a
variety of data sources, including large commercial databases,
through a uniform interface. Users, according to the plan, should
be able to search for any available information without having to
master the internal organization and query techniques of each
source.

See Thinking Machines, Page 24


Peat Marwick tries 'partner-friendly' system

When Thinking Machines Corp., Dow Jones & Co. and Apple first
broached the concept of the Wide Area Information Server (WAIS)
with KPMG Peat Marwick, representatives of the accounting giant
were intrigued but cautious, according to Brewster Kahle, project
leader for Thinking Machines.

They weren't interested, he said, in another complex querying
application that busy tax consultants, accountants and managers
would never bother to use. But they agreed to participate in the
project, according to Kahle, on the promise of a system that would
be genuinely "partner-friendly," with "no algebra, no ifs, ands or
buts."

After a year of preliminary work, an experimental WAIS R&D project
went on-line at Peat Marwick last October. About 10 users at the
company's Montvale, N.J., headquarters, including "very senior
partners," took part in the experiment, along with two others in
Manhattan and 10 more on the West Coast, according to Robin Palmer,
senior manager at KPMG Peat Marwick in San Jose, Calif. The remote
users were connected by leased lines to a WAIS server running on a
Connection Machine, a Thinking Machines parallel-processing system,
installed in Montvale.

The Peat Marwick experiment relied on WAIStation, a Mac-based
client software program developed by Thinking Machines, as a front
end. To prepare a query, users need only enter the subject they are
interested in, in English ("IBM and Motorola," for instance, or
"recent developments in personal computers"), in a text field
labeled "Look for documents about." They then drag icons
representing possible sources, local or remote, into another
field.

When the query is run, the Macintosh-based front end encodes the
search string according to the WAIS protocol and passes it to the
specified servers. Each server translates the query into its own
language, locates matching articles and returns the results to the
front end. The WAIStation application then displays headlines for
each article; the citations are ranked according to probable
relevance, based on algorithms that consider the position,
frequency and proximity of desired terms within the text.
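To make the three ranking signals concrete, here is a toy scoring function in Python. It is not the WAIStation or Connection Machine algorithm, which the article does not spell out. The function names, the equal weighting of the three signals, and the sample articles are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Split text into lowercase words and numbers."""
    return re.findall(r"[a-z0-9]+", text.lower())


def relevance(query, document):
    """Toy score combining frequency, position and proximity of query terms."""
    terms = set(tokenize(query))
    words = tokenize(document)
    hits = [i for i, w in enumerate(words) if w in terms]
    if not hits:
        return 0.0

    # Frequency: share of the document's words that are query terms.
    frequency = len(hits) / len(words)

    # Position: an early first hit (for example in the headline) scores higher.
    position = 1.0 - hits[0] / len(words)

    # Proximity: reward the closest pair of different query terms.
    proximity = 0.0
    for a, b in zip(hits, hits[1:]):
        if words[a] != words[b]:
            proximity = max(proximity, 1.0 / (b - a))

    return frequency + position + proximity


def rank(query, documents):
    """Return titles of matching documents, most relevant first."""
    scored = [(relevance(query, text), title) for title, text in documents.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for score, title in scored if score > 0]


articles = {
    "chips": "IBM and Motorola announce a joint chip venture",
    "pcs": "Sales of personal computers rose while IBM cut prices "
           "and analysts said Motorola may follow",
    "weather": "Rain is expected across the region on Friday",
}
ranking = rank("IBM Motorola", articles)
```

In this sample the "chips" article ranks first because both terms appear early and close together, while the "weather" article contains neither term and is left out.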

By double-clicking on the headline, users can get the full text of
any of the articles. And if the user drags the most useful titles
into a bin labeled "Similar to" and reruns the search, the system
will track down additional articles that share a large number of
words with those selected.
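The "Similar to" step can be approximated by counting shared words. The sketch below is a simplified stand-in, not the actual WAIS feedback mechanism. The `similar_to` function, its `top` parameter, and the sample texts are hypothetical, and every distinct word counts equally, including common ones such as "a".

```python
import re


def word_set(text):
    """Return the set of distinct lowercase words in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def similar_to(selected, candidates, top=3):
    """Rank candidates by how many distinct words they share with the selection."""
    profile = set()
    for text in selected:
        profile |= word_set(text)
    scored = [(len(profile & word_set(text)), title)
              for title, text in candidates.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for shared, title in scored[:top] if shared > 0]


chosen = ["IBM and Motorola plan a joint chip venture"]
pool = {
    "chip plant": "Motorola opens a new chip plant",
    "merger": "Two banks plan a merger",
    "weather": "Rain expected Friday",
}
suggestions = similar_to(chosen, pool)
```

A practical system would presumably discount very common words so that they do not dominate the overlap. That refinement is left out here to keep the idea visible.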

Peat Marwick completed its WAIStation testing in February. In part
because the cost of maintaining a real-time wide-area link among
its many offices would be "substantial," according to Palmer, the
company has not made a commitment to the system and is still
considering a variety of alternatives. But, he said, "we are still
extremely interested in the WAIS concepts. It's a most promising
technology."  —  By  Henry  Norr 


Thinking Machines, from Page 22

The WAIS system has three components:

> Server software. Any information source capable of locating and
presenting text in response to a request in WAIS format can
function as a server; the source can be on the user's own machine,
on a LAN or at a remote site connected by modem. The WAIS client
software can keep track of multiple servers, search any or all in
response to a single request and consolidate the results.

Thinking Machines now includes the WAIS text-indexing and retrieval
software free with its Connection Machines, a line of massively
parallel systems that range in price from $100,000 to $5 million,
according to Kahle. In addition, the companies participating in the
project developed a sample server that runs on standard Unix
systems. But any text-retrieval program on any platform, including
the Mac, could be adapted to function as a WAIS server.

> Protocol. To foster the development of WAIS-compatible data
sources, the four companies created an open protocol for
transmitting queries and responses. It is based on an existing
standard, the National Information Standards Organization's Z39.50
protocol, but is enhanced in several ways, such as by the addition
of support for audio and video information.

> Clients. WAIS was designed to support a variety of interfaces
running on various platforms and tailored to different niches.

The system does not rely on a specialized query language; the front
end simply passes English-language search strings entered by the
user to the server.
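The client's job of searching several servers with one request and consolidating the results can be sketched as follows. This is an illustration only: `search_all`, `make_source`, and the word-match scoring are hypothetical, and a real client would speak the WAIS protocol over a network rather than call local functions.

```python
def search_all(query, servers):
    """Send one query to several sources and merge the ranked results."""
    merged = []
    for name, search in servers.items():
        for score, headline in search(query):
            merged.append((score, name, headline))
    # Highest score first; ties broken by server name and headline.
    merged.sort(key=lambda row: (-row[0], row[1], row[2]))
    return merged


def make_source(articles):
    """Build a hypothetical source that scores headlines by matching words."""
    def search(query):
        wanted = set(query.lower().split())
        results = []
        for headline in articles:
            score = len(wanted & set(headline.lower().split()))
            if score:
                results.append((score, headline))
        return results
    return search


servers = {
    "local-files": make_source(["Memo on personal computers", "Travel policy"]),
    "news-wire": make_source(["Recent developments in personal computers"]),
}
results = search_all("recent developments in personal computers", servers)
```

One design question this glosses over is that scores from different servers may not be directly comparable, since each server translates the query into its own language. The sketch simply assumes they are.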

In addition to the prototype WAIStation interface and Apple's
experimental personal newspaper, front ends already are available
for the X Window System and GNU emacs, an extensible text editor
that runs under a freely distributed Unix-like operating system
developed at the Massachusetts Institute of Technology in
Cambridge.

To promote the WAIS concept, Thinking Machines is making source
code for the system available over the Internet or by mail. The
code comes free of charge but without support. Using the software,
programmers at MIT and elsewhere already have created more than 20
WAIS servers, including a poetry server, a weather server and a
catalog of government programs. Thinking Machines will maintain a
publicly accessible directory of servers, which will include
descriptions of all known servers and special files that allow WAIS
front ends to plug into them.

Large PC Libraries
Are Being Developed

By  JOHN  MARKOFF 

The development of a nationwide data network will allow personal
computer users to tap sources as large as the Library of Congress
or receive their own personalized electronic newspapers.

Several innovations, taken together, have already demonstrated that
searching vast computer data bases can be easier than consulting a
card catalogue, and not nearly as difficult or expensive as
computer searches are today. Computer users might read some Dickens
more readily than they could check out David Copperfield from the
local library.

Those in the industry say that users with few computer skills will
soon be able to search through several terabytes of information, or
several trillion characters of text, in seconds. The Library of
Congress, with 80 million items, contains an estimated 25
terabytes of information.


Already, an experimental computer library has linked 150
universities to 40 sources of information, ranging from National
Institutes of Health data to corporate documents and Shakespeare's
plays. New software allows users to browse or zero in on particular
information.

As methods of retrieving information are standardized and
perfected, industry executives and computer scientists say,
thousands of new services, ranging from electronic newspapers to
the computer equivalent of free public libraries, will blossom.
"Everyone is realizing how important it is to get into the mass
market for information," said Thomas Koulopoulos, president of
Delphi Consulting Group, a Boston market research firm.


Such ready access to huge amounts
of computerized information has
been the dream of many in the industry.
But a lack of computing power,
effective software and high-speed
digital networks has stalled progress
until recently.

If  many  of  the  technical  problems 
are  being  solved,  major  business  and 
political  disputes  remain.  The  re- 
searchers acknowledge  that  they 
must  resolve  several  questions  of  pri- 
vacy and  pricing  before  they  can  put 
the  new  methods  to  commercial  use. 

Many  sources  of  information,  like 
government  documents,  might  be 
available  free,  but  other  services,  in- 
cluding electronic  newspapers,  will 
be  available  only  to  those  who  pay. 
The  industry  has  yet  to  settle  on  ways 
to  protect  and  charge  for  intellectual 
property  in  a  computer  network 
where information can be copied instantly.
But to encourage progress,
the Thinking Machines Corporation, a
Cambridge, Mass., supercomputer
manufacturer, has made its software
available at no charge.

Some industry enthusiasts say the
new technology will transform the
way computerized information is
sold. Mitchell Kapor, the founder of
the Lotus Development Corporation,
predicts the growth of a new industry
as significant as the personal computer
business. Some companies, like
Dow Jones & Company, that already
provide computerized information
over telephone lines have taken part
in developing the new computer library.

Brewster Kahle was the leader of the development
team at the Thinking Machines Corporation for a
nationwide computerized library system. His team's
software links a CM2A Connection Machine, left,
with a personal computer or work station like the
Apple Macintosh II at right. Using high-speed data
highways, the two machines can function together
although they may be thousands of miles apart.

Mike Theiler for The New York Times

The  Search  Is  Simplified 

In  1989,  Thinking  Machines  enlisted 
the  support  of  Dow  Jones,  Apple  Com- 
puter Inc.  and  the  KPMG  Peat  Mar- 
wick  accounting  and  consulting  firm 
to  design  the  computer  library,  called 
Wide  Area  Information  Servers,  or 
WAIS  (pronounced  ways).  The  sys- 
tem permits  computer  users  to 
quickly  search  through  a  huge  vol- 
ume of  information  even  if  it  is  stored 
at  several  distant  locations. 

The system lets users conduct
searches by typing common English
phrases instead of more complicated
computer commands. While current
systems like Dialog and Nexis require
users to specify precisely the
information they want, the new system
can respond to a user's inferences.
It initially presents a sample
list of documents. The user chooses
one or several, and then a "relevance
feedback" program presents other
documents most like the ones selected.

"This solves the problem of how to
get to the information you need, getting
not too much and not too little,"
said Esther Dyson, editor of Release
1.0, a computer industry newsletter.

It will soon be
possible to search
through millions
of items in
seconds.

This  is  a  sharp  contrast  to  the  way 
services  operate  today,  Ms.  Dyson 
said.  A  computer  user  may  need  to 
call  seven  or  eight  separate  data 
bases  depending  on  the  kind  of  infor- 
mation needed. 
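
The search-then-refine loop described above (a plain-English query, a sample list of documents, then "other documents most like the ones selected") can be illustrated with a small sketch. The article does not say how WAIS scores documents, so the word-count vectors and cosine similarity used here are an assumption chosen for simplicity, and the function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Count lowercase word occurrences in a piece of text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, documents, limit=3):
    """Rank documents (title -> text) against a plain-English query."""
    q = term_vector(query)
    ranked = sorted(
        documents,
        key=lambda title: cosine(q, term_vector(documents[title])),
        reverse=True,
    )
    return ranked[:limit]


def more_like(selected, documents, limit=3):
    """Relevance feedback: rank unselected documents by similarity to the chosen ones."""
    profile = Counter()
    for title in selected:
        profile.update(term_vector(documents[title]))
    candidates = [t for t in documents if t not in selected]
    ranked = sorted(
        candidates,
        key=lambda t: cosine(profile, term_vector(documents[t])),
        reverse=True,
    )
    return ranked[:limit]


# Hypothetical sample collection, for illustration only.
documents = {
    "hamlet": "the prince of denmark sees the ghost of his father the king",
    "macbeth": "the thane hears witches promise he will be king of scotland",
    "nih-report": "clinical trial data on heart disease from the national institutes of health",
    "memo": "quarterly sales memo for the consulting group",
}

first_pass = search("ghost of the king", documents, limit=1)
print(first_pass)
print(more_like(first_pass, documents, limit=1))
```

A production system would presumably weight rare words more heavily than common ones such as "the"; raw counts are used here only to keep the sketch short.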

The  WAIS  system  lets  users  of 
Apple  personal  computers  harness  a 
network  of  Thinking  Machines  super- 
computers and  smaller  "server" 
computers  to  search  data  bases 
stored  by  Dow  Jones,  KPMG  and  sev- 
eral corporations  and  universities. 
Users  can  also  read  electronic  mail, 
enter  their  corporate  electronic  li- 
braries and  summon  up  a  wide  vari- 
ety of  documents,  newspapers  and 
magazines. 

A  'Corporate  Memory' 

At Thinking Machines, the WAIS
system serves as a "corporate memory,"
allowing employees to retrieve
memos, documents and other internal
information. Employees who may
not be working together can share
expertise.

"If someone did something in Los
Angeles and I'm sitting in San Francisco,
I may not know about the
work," said Robin Palmer, a senior
manager at Peat Marwick.

WAIS delivers information over the
Internet, a collection of 2,600 high-speed
public and private computer
networks. This Government-sponsored
system of data highways is rapidly
being improved and turned to
commercial uses.

The market for software that allows
the rapid retrieval of computerized
text is small but growing, according
to industry analysts. In 1989,
the United States had fewer than
60,000 users; by the next year, total
sales were about $120 million. The
Delphi Consulting Group expects the
market to grow to 160,000 users and
$235 million by 1992.

"Information retrieval technology
is starting to spread from supercomputers
all the way down to personal
computers," said Brewster Kahle, a
Thinking Machines scientist who has
led the WAIS experiment.

The WAIS system is built on a
procedure for retrieving information
developed by librarians who initially
set out to computerize their card
catalogues. The procedure, known
in the field as Z39.50, now has the
support of the Library of Congress,
Apple, Sun Microsystems Inc., Next
Inc., Dow Jones and Mead Data Central.

In the future, a special directory or
"white pages" will keep an up-to-date
list of all the separate sources on the
network.

Spreading Information

The Wide Area Information Servers system provides a broad range of information by
linking users to many independent sources. The information can be in the form of sound,
words or pictures.

Apple has its own electronic library
project, borrowing its name, Rosebud,
from the movie "Citizen Kane."
The three-year-old project is based on
the WAIS system, but adds features
including the ability for a user to develop
a personalized electronic newspaper.

Rosebud uses special programs,
called "reporters," that let customers
specify the kinds of information
and news they want to retrieve from
the WAIS system every day. Researchers
at Apple's Advanced Technology
Group said that in the future
the necessary retrieval software
might be a standard part of a computer's
operating system.

They expect improvements in the
Internet computer network to greatly
lower the cost of information
searches, promoting the introduction
of many new services. The Government
proposes to expand and improve
Internet by financing a National
Research and Education Network,
or NREN, that could extend
high-speed computer links into
schools and communities across the
country.

"With things like NREN, everything
could change overnight," said Tim
Oren, an Apple researcher.
By John Markoff

New  York  Times  Service 

NEW  YORK  —  The  develop- 
ment of  a  U.S.  data  network  will 
allow  users  of  personal  computers 
to  tap  sources  as  large  as  the  Li- 
brary of  Congress  or  receive  their 
own  personalized  newspapers. 

Several innovations have already
demonstrated that searching vast
data bases can be easier than consulting
a card catalogue, and not
nearly as difficult or expensive as
computer searches today.

Users with minimal computer
skills would soon be able to search
through several terabytes of information,
or several trillion characters
of text, in seconds. The Library
of Congress, with 80 million
items, contains an estimated 25 terabytes
of information.

Already  an  experimental  com- 
puter library  has  linked  150  univer- 
sities to  40  sources  of  information, 
ranging  from  National  Institutes  of 
Health  data  to  corporate  docu- 
ments and  Shakespeare's  plays. 

New software allows users to
browse or zero in.

As methods of retrieving information
are standardized and perfected,
industry executives and
computer scientists say, thousands
of new services, ranging from electronic
newspapers to the computer
equivalent of free public libraries,
will blossom.

"Everyone is realizing how important
it is to get into the mass
market for information," said
Thomas Koulopoulos, president of
Delphi Consulting Group, a Boston
market-research firm.
Such ready access to huge
amounts of computerized information
has been the dream of many,
but a lack of computing power,
effective software and high-speed
digital networks stalled progress.

If many of the technical problems
are being solved, major business
and political disputes remain.

The industry has yet to find ways
to protect and charge for intellectual
property in a computer network.

To encourage progress, Thinking
Machines Corp., a Cambridge,
Massachusetts, computer manufacturer,
has made its software free.

Some companies, like Dow Jones
Corp., that already provide computerized
information over telephone
lines, have taken part in developing
the new computer library.

In  1989,  Thinking  Machines  en- 
listed the  support  of  Dow  Jones, 
Apple  Computer  Inc.  and  the 
KPMG  Peat  Marwick  accounting 
and  consulting  firm  to  design  the 
computer  library,  called  Wide  Area 
Information  Servers,  or  WAIS. 

The  system  permits  computer  us- 
ers to  quickly  search  a  huge  volume 
of  information  even  if  it  is  stored  at 
several  distant  locations. 



While current systems like Dialog
and Nexis require users to specify
precisely what they want, the new system can
respond to inferences. It presents a
sample list of documents. The user
chooses one or several, and a feedback
program presents other documents
most like the ones selected.

"This  solves  the  problem  of  how 
to  get  to  the  information  you  need, 
getting  not  too  much  and  not  too 
little,"  said  Esther  Dyson,  editor  of 
Release  1.0,  a  computer  industry 
newsletter. 

A computer user may need to
call seven or eight separate data
bases depending on the information
needed.

The WAIS system lets users of
Apple personal computers harness
a network of Thinking Machines
supercomputers and smaller computers
to search data bases stored
by Dow Jones, KPMG and several
corporations and universities.

Users can also read electronic
mail, enter their corporate electronic
libraries and summon up a
wide variety of documents, newspapers
and magazines.

At Thinking Machines, the
WAIS system serves as a corporate
memory, allowing employees to retrieve
memos, documents and other
internal information.

In 1989, the United States had
fewer than 60,000 users in the market
for software that allows the
rapid retrieval of computerized
text. By the next year, total sales
were about $120 million. The Delphi
Consulting Group expects the
market to grow to 160,000 users
and $235 million by 1992.

Apple has its own electronic-library
project, borrowing its name,
Rosebud, from the movie "Citizen
Kane." The project is based on
WAIS, but adds features including
the ability for a user to develop a
personalized electronic newspaper.

Rosebud uses special programs,
called reporters, that allow customers
to specify the information and
news they want to retrieve from
WAIS daily.

"Information retrieval technology
is starting to spread from supercomputers
all the way down to
personal computers," said Brewster
Kahle, a Thinking Machines scientist.


